For this project, I decided to look at vehicle mortality in each state. Each state has its own laws and regulations when it comes to vehicle safety, as well as its own terrain and culture. I got the data from the CDC WONDER database and looked at all crash deaths of people inside a vehicle; pedestrians are not included. In the figures below, average vehicle mortality is calculated for each age group in each state, along with the overall mortality for each state. Because many of the raw rates are very small, they are multiplied by 1,000,000, so every number is expressed as deaths per million people.
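The scaling step described above can be sketched roughly as follows. The toy data frame and its column names (`deaths`, `population`, `age_group`) are made up for illustration; the actual CDC WONDER export may be laid out differently.

```r
# Hypothetical toy data: two states, two age groups, two years each.
# Column names are assumptions, not the real CDC WONDER export.
crashes <- data.frame(
  state      = rep(c("A", "B"), each = 4),
  age_group  = rep(c("15-24", "25-34"), times = 4),
  deaths     = c(10, 8, 12, 6, 3, 2, 5, 4),
  population = c(200000, 250000, 200000, 250000,
                 300000, 350000, 300000, 350000)
)

# Mortality per person, scaled to deaths per 1,000,000 people
crashes$rate_per_million <- crashes$deaths / crashes$population * 1e6

# Average rate for each state and age group across the rows (years)
mean_rates <- aggregate(rate_per_million ~ state + age_group,
                        data = crashes, FUN = mean)
mean_rates
```

Without the scaling, a rate such as 10 deaths in 200,000 people would show up as 0.00005, which is hard to compare by eye; scaled, it reads as 50 per million.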
Before the project started, I thought that states with the biggest cities, such as New York, would have the highest vehicle mortality rates because there is such limited space for cars, causing them to be packed closely together. However, if we look at the table below, the state with the highest overall mortality is Mississippi, which is considered to be a rural state. In fact, the top 10 states are all fairly rural states.
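library(huxtable)  # as_hux(), set_bold(), set_all_borders()
library(dplyr)     # provides the %>% pipe
# Sort states from highest to lowest overall mortality
sorted_cars <- meancars1[order(-meancars1$overall_mortality), ]
# Build the table with a single header row, bold it, and draw all borders
mortality_table <- as_hux(sorted_cars, add_colnames = TRUE) %>%
  set_bold(row = 1, col = everywhere, value = TRUE) %>%
  set_all_borders(TRUE)
mortality_table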
| State | <1 | 1-4 | 5-14 | 15-24 | 25-34 | 35-44 | 45-54 | 55-64 | 65-74 | 75-84 | 85+ | overall_mortality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Mississippi | 19.5 | 14.8 | 84.5 | 69.7 | 63.5 | 61.2 | 66.7 | 85.8 | 143 | 190 | 79.8 | |
| South Dakota | 15.3 | 73.4 | 52.7 | 45.1 | 34.5 | 47.6 | 63.7 | 114 | 125 | 63.4 | ||
| New Mexico | 12.2 | 10.9 | 63.3 | 47.3 | 40.9 | 39.8 | 37.5 | 46.4 | 90 | 125 | 51.3 | |
| Wisconsin | 5.58 | 7.32 | 57 | 37.9 | 29.3 | 32.9 | 36.7 | 51.9 | 102 | 133 | 49.3 | |
| Idaho | 11.1 | 10 | 61.4 | 33.5 | 34.5 | 34.5 | 36.3 | 56.6 | 90.9 | 122 | 49 | |
| Iowa | 9.33 | 10.4 | 53.3 | 33.6 | 31.6 | 36 | 35.5 | 47.9 | 107 | 121 | 48.5 | |
| Nebraska | 10.9 | 56.9 | 30.8 | 32.2 | 34.8 | 29.7 | 48.2 | 90.9 | 93.8 | 47.6 | ||
| Kansas | 10.8 | 7.78 | 56.6 | 33.9 | 32.6 | 29.8 | 32.3 | 44.9 | 96.3 | 106 | 45.1 | |
| Maine | 6.13 | 38 | 32.6 | 31.6 | 21.6 | 26.4 | 42.1 | 90 | 98.6 | 43 | ||
| Montana | 8.32 | 51.2 | 38.5 | 36.7 | 30.3 | 34.6 | 43.8 | 59.5 | 83.5 | 42.9 | ||
| Florida | 10.4 | 5.43 | 5.91 | 46.8 | 34.1 | 27.8 | 26.4 | 26.2 | 30.5 | 63.1 | 91.5 | 33.5 |
| North Dakota | 27 | 17.7 | 20.2 | 17.7 | 17 | 35.6 | 51.1 | 62.5 | 31.1 | |||
| Oregon | 6.66 | 6.91 | 37.1 | 22.8 | 18.6 | 24.6 | 25.2 | 32 | 58.7 | 71.8 | 30.4 | |
| North Carolina | 8.34 | 7.31 | 6.4 | 36.8 | 26.8 | 22.4 | 21.8 | 22.7 | 29.8 | 64.9 | 70.3 | 28.9 |
| Tennessee | 13.2 | 4.57 | 5.69 | 40.1 | 27.4 | 26.5 | 24.5 | 21.7 | 30.4 | 57.6 | 57 | 28.1 |
| Missouri | 5.12 | 5.43 | 40.6 | 24.4 | 22.5 | 20.9 | 22 | 27.9 | 51.8 | 50.1 | 27.1 | |
| Texas | 11.1 | 7.39 | 5.99 | 35.8 | 26 | 21.8 | 21.6 | 22.4 | 29.3 | 47.6 | 62.3 | 26.5 |
| Vermont | 28.5 | 21.1 | 14.3 | 12.5 | 19.2 | 33.2 | 48.9 | 25.4 | ||||
| Wyoming | 38.4 | 17.7 | 17.3 | 30.3 | 16.2 | 24 | ||||||
| Washington | 4.24 | 4.71 | 29.9 | 17.4 | 16 | 16.7 | 18.5 | 23.4 | 48.9 | 56.9 | 23.7 | |
| Minnesota | 4.42 | 4.43 | 26.2 | 18.3 | 14 | 14.4 | 14.9 | 23.6 | 50.8 | 60.4 | 23.1 | |
| Arkansas | 8.06 | 6.31 | 29.5 | 22.2 | 18.6 | 22.4 | 16.1 | 21.3 | 34.7 | 49.9 | 22.9 | |
| Rhode Island | 13.5 | 6.33 | 12.3 | 28.9 | 49.9 | 22.2 | ||||||
| Kentucky | 4.14 | 3.64 | 24.6 | 18.6 | 16.8 | 14.4 | 14.3 | 20.6 | 40.5 | 38.9 | 19.7 | |
| Maryland | 1.98 | 20.6 | 16.1 | 9.26 | 12.5 | 13.2 | 18.5 | 29.4 | 37.8 | 17.7 | ||
| West Virginia | 17.9 | 19.2 | 10 | 11.6 | 9.24 | 14.2 | 23.5 | 28 | 16.7 | |||
| Delaware | 18.2 | 9.44 | 12.3 | 16.9 | 26.5 | 16.7 | ||||||
| South Carolina | 2.59 | 16.4 | 14.9 | 12.9 | 11.6 | 12.4 | 15.9 | 29.9 | 31.9 | 16.5 | ||
| Indiana | 3.46 | 2.66 | 17.9 | 11.6 | 11 | 10.6 | 10.4 | 18 | 34.3 | 43.1 | 16.3 | |
| California | 4.44 | 3.23 | 3.37 | 21 | 16.4 | 12 | 13.2 | 14.8 | 17.1 | 29.4 | 38.5 | 15.8 |
| Ohio | 1.81 | 13.9 | 10.4 | 9.45 | 9.44 | 10.6 | 16.8 | 27.5 | 37.8 | 15.3 | ||
| Nevada | 12.5 | 9.27 | 8.98 | 6.93 | 9.05 | 14.4 | 25.3 | 32.7 | 14.9 | |||
| Alabama | 2.26 | 16.3 | 14 | 12.2 | 11.3 | 11.5 | 15.3 | 24.4 | 20.2 | 14.2 | ||
| Virginia | 1.43 | 12 | 8.61 | 7.53 | 8.4 | 10 | 12.3 | 28.5 | 35.5 | 13.8 | ||
| Michigan | 2.45 | 14.2 | 7.93 | 7.68 | 8.05 | 8.01 | 12 | 26.4 | 34.8 | 13.5 | ||
| Illinois | 6.4 | 2.47 | 1.69 | 17.7 | 10.7 | 9.91 | 9.3 | 10.6 | 15 | 26.7 | 28.3 | 12.6 |
| Georgia | 1.99 | 2.23 | 12.8 | 9.49 | 9.5 | 8.61 | 12 | 15.8 | 25.2 | 27.9 | 12.6 | |
| New Hampshire | 9.94 | 9.41 | 8.6 | 9.54 | 11.8 | 12.5 | 23.3 | 12.2 | ||||
| Colorado | 3.79 | 2.38 | 13.6 | 8.71 | 7.53 | 7.46 | 8.87 | 12 | 20.6 | 32.3 | 11.7 | |
| Oklahoma | 9.66 | 7.68 | 6.48 | 6.34 | 6.9 | 9.83 | 15.4 | 16.6 | 9.85 | |||
| New York | 1.45 | 1.14 | 10.1 | 5.64 | 4.41 | 5.17 | 6.03 | 10.3 | 22.9 | 26.4 | 9.35 | |
| Utah | 10 | 8.14 | 7.66 | 7.43 | 7.89 | 8.92 | 15.4 | 9.35 | ||||
| Louisiana | 3.65 | 1.52 | 7.9 | 8.57 | 8.11 | 7.04 | 6.01 | 7.04 | 13.3 | 20.8 | 8.4 | |
| Pennsylvania | 0.96 | 8.85 | 5.53 | 5.6 | 5 | 5.6 | 7.66 | 13.9 | 16.5 | 7.74 | ||
| New Jersey | 7.39 | 5.27 | 3.53 | 3.76 | 4.81 | 7.62 | 13.5 | 12.9 | 7.34 | |||
| District of Columbia | 7.09 | 7.09 | ||||||||||
| Hawaii | 7.46 | 4.84 | 5.53 | 5.26 | 5.77 | |||||||
| Arizona | 5.17 | 4.79 | 4.24 | 4.1 | 3.56 | 6.14 | 9.07 | 5.3 | ||||
| Massachusetts | 2.55 | 1.61 | 1.47 | 2.11 | 2.59 | 4.63 | 9.03 | 9.64 | 4.2 | |||
| Connecticut | 2.15 | 2.12 | 5.54 | 3.27 |
I wanted to visualize the data as a choropleth, which we can see below.
ggplotly(p)
I thought about speed limits and found that Idaho, Maine, North Dakota, South Dakota, Texas and Wyoming all have speed limits of 75-80 mph on certain highways. However, there doesn’t seem to be much of a connection, and it doesn’t explain why Mississippi’s vehicle mortality rate is so high.
I did some research on state seat belt laws and found that they fall into two categories: primary and secondary. In states with primary seat belt laws, law enforcement can stop and fine vehicle occupants for not wearing a seat belt. In states with secondary seat belt laws, law enforcement cannot stop you solely because you aren’t wearing a seat belt. I used data from the Insurance Institute for Highway Safety (IIHS) to look at each state’s laws. However, I found that many of the states with the highest vehicle mortality rates had already enacted primary seat belt laws. This suggests that there is something else to consider here.
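library(usmap)  # provides fips() and plot_usmap()
# Look up each state's two-digit FIPS code so the data can be matched to the map
seatbelt$fips <- fips(seatbelt$State)
seatbelt$fips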
## [1] "01" "02" "04" "05" "06" "08" "09" "10" "11" "12" "13" "15" "16" "17" "18"
## [16] "19" "20" "21" "22" "23" "24" "25" "26" "27" "28" "29" "30" "31" "32" "33"
## [31] "34" "35" "36" "37" "38" "39" "40" "41" "42" "44" "45" "46" "47" "48" "49"
## [46] "50" "51" "53" "54" "55" "56"
plot_usmap(data = seatbelt, values = "YesNo") +
  scale_fill_continuous(
    low = "white", high = "#00008B",
    name = "primary or secondary seatbelt law", label = scales::comma
  ) +
  theme(legend.position = "bottom") +
  theme(panel.background = element_rect(colour = "black")) +
  labs(title = "Seatbelt Law by State")
I found some data on road conditions from the Office of Highway Policy Information and decided to look at the share of each state’s roads that are rural, as a percentage of total roads. I ran a simple linear regression (SLR) to see whether the percentage of rural roads has any connection to vehicle mortality rates. I also ran a multiple linear regression (MLR) that controls for state seat belt laws.
mort_rural <- lm( overall_mortality~ percent_rural, data = rural_mean)
mort_rural
##
## Call:
## lm(formula = overall_mortality ~ percent_rural, data = rural_mean)
##
## Coefficients:
## (Intercept) percent_rural
## -4.7542 0.4318
The first coefficient, -4.75, is the y-intercept: the predicted mortality if the percentage of rural roads were 0. A negative rate is not possible, so the intercept should be read as an extrapolation rather than a realistic value. The second coefficient, 0.43, is the slope: the change in predicted mortality for each one-percentage-point increase in rural roads. According to this model, mortality increases by 0.43 deaths per million for each additional percentage point of rural roads.
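To make the interpretation concrete, here is a small sketch that plugs a few rural-road percentages into the fitted line. The coefficients are copied from the output above; the three percentages are arbitrary values chosen for illustration, not actual states.

```r
# Coefficients copied from the lm() output above
intercept <- -4.7542
slope     <- 0.4318

# Hypothetical rural-road percentages to plug into the fitted line
percent_rural <- c(25, 50, 75)

# Predicted overall mortality (deaths per million) from y = a + b * x
predicted <- intercept + slope * percent_rural
data.frame(percent_rural, predicted)

# The gap between two predictions one percentage point apart is the slope
one_point_gap <- (intercept + slope * 51) - (intercept + slope * 50)
one_point_gap
```

The predictions come out to roughly 6.0, 16.8, and 27.6 deaths per million, and the one-point gap is the slope, 0.4318.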
ggplot(rural_mean, aes(x = percent_rural, y = overall_mortality)) +
geom_point() +
geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 5 rows containing non-finite values (stat_smooth).
## Warning: Removed 5 rows containing missing values (geom_point).
Looking back at the table, the data point at about 80 on the y-axis is Mississippi.
confint(mort_rural)
## 2.5 % 97.5 %
## (Intercept) -18.0229704 8.5146629
## percent_rural 0.2422393 0.6212997
We can say with 95% confidence that the true value of the slope is between 0.24 and 0.62 per one-percentage-point increase in rural roads out of total roads. Because this interval does not include 0, the percentage of rural roads is a statistically significant predictor of vehicle mortality at the 5% level.
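For readers curious where these bounds come from, the sketch below rebuilds a slope confidence interval by hand as estimate ± t × standard error and compares it with `confint()`. It runs on simulated stand-in data (the values of `x` and `y` are invented), so the numbers will not match the ones above.

```r
set.seed(1)
# Simulated stand-in data; not the real state data
x <- runif(46, 20, 90)
y <- -5 + 0.4 * x + rnorm(46, sd = 12)
fit <- lm(y ~ x)

# Slope estimate and its standard error from the coefficient table
est <- coef(summary(fit))["x", "Estimate"]
se  <- coef(summary(fit))["x", "Std. Error"]

# Critical t value for a 95% interval with n - 2 residual degrees of freedom
t_crit <- qt(0.975, df = df.residual(fit))
manual_ci <- c(est - t_crit * se, est + t_crit * se)

manual_ci
confint(fit)["x", ]

# The slope is significant at the 5% level when the interval excludes 0
excludes_zero <- manual_ci[1] > 0 || manual_ci[2] < 0
excludes_zero
```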
summary(mort_rural)$r.squared
## [1] 0.3088588
About 31% of the variation in vehicle mortality across states can be explained by each state’s percentage of rural roads.
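As a rough illustration of what R-squared measures, the sketch below computes it by hand as one minus the ratio of residual variation to total variation. The data are simulated for the example, so the value differs from the 0.31 reported above.

```r
set.seed(2)
# Simulated stand-in data; not the real state data
x <- runif(46, 20, 90)
y <- -5 + 0.4 * x + rnorm(46, sd = 12)
fit <- lm(y ~ x)

# Variation left over after fitting the line
ss_res <- sum(residuals(fit)^2)
# Total variation of y around its mean
ss_tot <- sum((y - mean(y))^2)

r2_manual <- 1 - ss_res / ss_tot

r2_manual
summary(fit)$r.squared
# With a single predictor, R-squared also equals the squared correlation
cor(x, y)^2
```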
mlr<- lm(overall_mortality~ percent_rural +YesNo, data=rural_mean_seatbelt)
mlr
##
## Call:
## lm(formula = overall_mortality ~ percent_rural + YesNo, data = rural_mean_seatbelt)
##
## Coefficients:
## (Intercept) percent_rural YesNo
## -9.2329 0.4557 4.2773
coef(mlr)
## (Intercept) percent_rural YesNo
## -9.2329257 0.4557438 4.2773109
confint(mlr)
## 2.5 % 97.5 %
## (Intercept) -25.4960525 7.0302010
## percent_rural 0.2594256 0.6520619
## YesNo -4.6785474 13.2331691
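Controlling for seat belt laws, the predicted vehicle mortality rate is 0.46 deaths per million higher for each one-percentage-point increase in rural roads (95% CI: 0.26 to 0.65). The confidence interval for the seat belt law variable (-4.68 to 13.23) includes 0, so it is not a significant predictor in this model.
In the SLR model, the coefficient for percent rural roads is 0.43, while in the MLR model it is 0.46, which is slightly higher. The coefficient shifts because predictor variables, such as seat belt laws and rural road percentage, can be correlated with each other, and looking at only one of them at a time can hide the true association. This is called confounding. Here the shift is small, and 0.43 still falls inside the MLR confidence interval, so seat belt laws do not appear to change the picture much.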
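The idea of confounding can be illustrated with a small simulation. Everything below is invented for the example (the variable names, the effect sizes, and the strength of the correlation); it only shows how a coefficient can move once a correlated predictor is added to the model.

```r
set.seed(3)
n <- 500

# Hypothetical binary law indicator (1 or 0), unrelated to the real data
law <- rbinom(n, 1, 0.5)
# Rural percentage is made to depend on the law indicator, so the two correlate
rural <- 50 + 15 * law + rnorm(n, sd = 10)
# The outcome depends on both predictors; the true rural effect is 0.4
mortality <- -5 + 0.4 * rural + 6 * law + rnorm(n, sd = 5)

slr_sim <- lm(mortality ~ rural)
mlr_sim <- lm(mortality ~ rural + law)

coef(slr_sim)["rural"]  # absorbs part of the law effect
coef(mlr_sim)["rural"]  # close to the 0.4 used in the simulation
```

In this simulation the single-predictor slope is pushed upward; in other data the shift could go the other way, depending on the signs of the correlations involved.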
plot(mlr, which = 1)
In this residuals vs. fitted plot, the red line deviates from the horizontal line at y = 0. The red line is curved, which suggests that a model with higher-order terms of x (quadratic, cubic, etc.) might fit this data better than a straight line.
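One possible way to follow up on that curvature is to fit a quadratic model and compare it with the straight line. The sketch below does this on simulated data with curvature deliberately built in; I have not run it on the state data, so it says nothing about whether a quadratic term actually helps there.

```r
set.seed(4)
# Simulated stand-in data with built-in curvature; not the real state data
x <- runif(46, 20, 90)
y <- 5 + 0.005 * (x - 20)^2 + rnorm(46, sd = 3)

linear_fit    <- lm(y ~ x)
quadratic_fit <- lm(y ~ poly(x, 2))  # adds a squared term

# F-test: does the squared term improve on the straight line?
anova(linear_fit, quadratic_fit)

# Lower AIC indicates the better trade-off between fit and complexity
AIC(linear_fit, quadratic_fit)
```

Calling `plot(quadratic_fit, which = 1)` afterwards would redraw the same diagnostic for the quadratic model, which makes it easy to see whether the red line has flattened out.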
In conclusion, the analysis suggests that something about rural roads leads to higher vehicle mortality rates on average. My hypothesis that states with more urban areas would have higher vehicle mortality rates was wrong, and the opposite is true instead. However, only roughly 30% of the variation in mortality rates can be explained by this, so there is more going on here. We can still brainstorm some recommendations based on this. I suggest that speed limits be lowered and enforced on rural roads, since wide open roads with few cars may encourage drivers to speed and crash at higher speeds. I would also suggest that more public transit infrastructure be developed in rural areas. One of the biggest differences between urban and rural areas is that urban areas are densely built and often have public transport. This is a costly investment and may not be possible for all states.